kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity
Similar resources
kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity
Modern genomics techniques generate overwhelming quantities of data. Extracting population genetic variation demands computationally efficient methods to determine genetic relatedness between individuals (or "samples") in an unbiased manner, preferably de novo. Rapid estimation of genetic relatedness directly from sequencing data has the potential to overcome reference genome bias, and to verif...
Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus.
BACKGROUND: De novo assembly of non-model organisms' transcriptomes has recently been on the rise in concert with the number of de novo transcriptome assembly software programs. There is a knowledge gap as to which assembler software or k-mer strategy is best for construction of an optimal de novo assembly. Additionally, there is a lack of consensus on which evaluation metrics should be used to a...
Optimizing k-mer size using a variant grid search to enhance de novo genome assembly
Largely driven by huge reductions in per-base costs, sequencing nucleic acids has become a near-ubiquitous technique in laboratories performing biological and biomedical research. Most of the effort goes to re-sequencing, but assembly of de novo generated, raw sequence reads into contigs that span as much of the genome as possible is central to many projects. Although truly complete coverage is ...
Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects
Background: With the fast development of next generation sequencing technologies, increasing numbers of genomes are being de novo sequenced and assembled. However, most are in fragmental and incomplete draft status, and thus it is often difficult to know the accurate genome size and repeat content. Furthermore, many genomes are highly repetitive or heterozygous, posing problems to current assem...
Inner Product Similarity Search using Compositional Codes
This paper addresses the nearest neighbor search problem under inner product similarity and introduces a compact code-based approach. The idea is to approximate a vector using the composition of several elements selected from a source dictionary and to represent this vector by a short code composed of the indices of the selected elements. The inner product between a query vector and a database ...
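The idea in this abstract, approximating a vector by a sum of dictionary elements and storing only their indices, can be illustrated with a minimal Python sketch. The abstract is truncated and does not say how elements are selected or how the dictionaries are built, so the greedy residual selection in `encode`, the function names, and the random dictionaries used in the test data are all assumptions for illustration, not the paper's method. The only fact relied on is that the inner product is linear, so the inner product with a sum of elements equals the sum of per-element inner products.

```python
import random


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def encode(x, dictionaries):
    """Return one index per dictionary (the short code).

    Assumption: elements are chosen greedily, each time picking the element
    closest to the remaining residual. The source does not specify this.
    """
    residual = list(x)
    code = []
    for elements in dictionaries:
        best = min(
            range(len(elements)),
            key=lambda k: sum((r - e) ** 2 for r, e in zip(residual, elements[k])),
        )
        code.append(best)
        residual = [r - e for r, e in zip(residual, elements[best])]
    return code


def decode(code, dictionaries):
    """Rebuild the approximation as the sum of the selected elements."""
    approx = [0.0] * len(dictionaries[0][0])
    for elements, k in zip(dictionaries, code):
        approx = [a + e for a, e in zip(approx, elements[k])]
    return approx


def approx_inner_product(query, code, dictionaries):
    """Inner product of the query with the coded vector via lookup tables.

    By linearity, <q, sum of elements> equals the sum of <q, element>, so the
    per-dictionary tables can be computed once per query and reused for
    every database code.
    """
    tables = [[dot(query, e) for e in elements] for elements in dictionaries]
    return sum(table[k] for table, k in zip(tables, code))


# Hypothetical toy data: 3 dictionaries of 8 random 4-dimensional elements.
rng = random.Random(0)
toy_dictionaries = [
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)] for _ in range(3)
]
toy_code = encode([0.3, -0.2, 0.5, 0.1], toy_dictionaries)
```

In this sketch a database vector is stored as three small integers (`toy_code`) instead of four floats, and `approx_inner_product` never reconstructs the vector; how well the approximation matches the true inner product depends entirely on the dictionaries, which here are random placeholders.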
Journal
Journal title: PLOS Computational Biology
Year: 2017
ISSN: 1553-7358
DOI: 10.1371/journal.pcbi.1005727